Incorporating Alternate Translations into English Translation Treebank

Authors

  • Ann Bies
  • Justin Mott
  • Seth Kulick
  • Jennifer Garland
  • Colin Warner

Abstract

New annotation guidelines and new processing methods were developed to accommodate English treebank annotation of a parallel English/Chinese corpus of web data that includes alternate English translations (one fluent, one literal) of expressions that are idiomatic in the Chinese source. In previous machine translation programs, alternate translations of idiomatic expressions had been present in untreebanked data only, but due to the high frequency of such expressions in informal genres such as discussion forums, machine translation system developers requested that alternatives be added to the treebanked data as well. In consultation with machine translation researchers, we chose a pragmatic approach of syntactically annotating only the fluent translation, while retaining the alternate literal translation as a segregated node in the tree. Since the literal translation alternates are often incompatible with English syntax, this approach allows us to create fluent trees without losing information. This resource is expected to support machine translation efforts, and the flexibility provided by the alternate translations is an enhancement to the treebank for this purpose.
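The idea of keeping the literal alternate as a segregated node can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the node label `ALT-LITERAL`, the `TOK` leaf label, the `Node` class, and the example sentence are hypothetical and are not taken from the annotation guidelines the abstract refers to. The sketch only shows that a tree can carry a fluent, syntactically annotated translation while holding the literal translation in a separate, unanalyzed node that consumers may read or skip.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical label for the segregated literal-translation node.
# The actual treebank label is not given in the abstract.
ALT_LABEL = "ALT-LITERAL"


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None  # set only on leaf nodes


def all_words(node: Node) -> List[str]:
    """Collect every leaf word under a node, left to right."""
    if node.word is not None:
        return [node.word]
    words: List[str] = []
    for child in node.children:
        words.extend(all_words(child))
    return words


def fluent_words(node: Node) -> List[str]:
    """Collect leaf words while skipping segregated literal alternates."""
    if node.label == ALT_LABEL:
        return []
    if node.word is not None:
        return [node.word]
    words: List[str] = []
    for child in node.children:
        words.extend(fluent_words(child))
    return words


def literal_alternates(node: Node) -> List[str]:
    """Return the text of each segregated literal alternate in the tree."""
    if node.label == ALT_LABEL:
        return [" ".join(all_words(node))]
    found: List[str] = []
    for child in node.children:
        found.extend(literal_alternates(child))
    return found


# Invented example: fluent "He overdid it", with an invented literal
# rendering of an idiom kept as a flat, syntactically unanalyzed node.
literal = Node(ALT_LABEL, [Node("TOK", word=w)
                           for w in "drew a snake and added feet".split()])
tree = Node("S", [
    Node("NP", [Node("PRP", word="He")]),
    Node("VP", [
        Node("VBD", word="overdid"),
        Node("NP", [Node("PRP", word="it")]),
        literal,
    ]),
])

print(" ".join(fluent_words(tree)))
print(literal_alternates(tree))
```

Under these assumptions, a consumer that wants only the fluent tree ignores any subtree with the segregated label, while a consumer interested in the literal rendering retrieves it without it interfering with the English syntax of the rest of the tree.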

Publication date: 2014